
Recursive Inference for Variational Autoencoders

Neural Information Processing Systems

Inference networks of traditional Variational Autoencoders (VAEs) are typically amortized, resulting in relatively inaccurate posterior approximation compared to instance-wise variational optimization. Recent semi-amortized approaches were proposed to address this drawback; however, their iterative gradient update procedures can be computationally demanding. In this paper, we consider a different approach: building a mixture inference model. We propose a novel recursive mixture estimation algorithm for VAEs that iteratively augments the current mixture with new components so as to maximally reduce the divergence between the variational and the true posteriors. Using the functional gradient approach, we devise an intuitive learning criterion for selecting a new mixture component: the new component has to improve the data likelihood (lower bound) and, at the same time, be as divergent from the current mixture distribution as possible, thus increasing representational diversity. Although similar approaches, termed boosted variational inference (BVI), have been proposed recently, our method differs from BVI in several aspects, most notably in that ours performs recursive inference in VAEs in amortized form, whereas BVI is developed within the standard VI framework and leads to a single non-amortized optimization instance, which is inappropriate for VAEs. A crucial benefit of our approach is that inference at test time requires only a single feed-forward pass through the mixture inference network, making it significantly faster than semi-amortized approaches. We show that our approach yields higher test data likelihood than the state of the art on several benchmark datasets.
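To make the selection criterion concrete, here is a minimal sketch (our assumptions, not the authors' released code) of how a candidate component could be trained: the loss is the negative per-sample ELBO minus a weighted single-sample Monte Carlo estimate of the KL divergence from the frozen mixture built so far. The names Encoder, mixture_log_prob, recursive_step, and beta, as well as the decoder interface decoder.log_prob(x, z) = log p(x|z), are all hypothetical; components are diagonal Gaussians with a standard normal prior.

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

class Encoder(nn.Module):
    """One mixture component q_k(z|x), a diagonal Gaussian."""
    def __init__(self, x_dim, z_dim, h_dim=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh())
        self.mu = nn.Linear(h_dim, z_dim)
        self.log_std = nn.Linear(h_dim, z_dim)

    def forward(self, x):
        h = self.body(x)
        return Normal(self.mu(h), self.log_std(h).exp())

def mixture_log_prob(components, log_weights, x, z):
    # log q_mix(z|x) = logsumexp_k [log pi_k + log q_k(z|x)]
    terms = torch.stack([lw + c(x).log_prob(z).sum(-1)
                         for c, lw in zip(components, log_weights)])
    return torch.logsumexp(terms, dim=0)

def recursive_step(new_comp, components, log_weights, decoder, x, beta=1.0):
    """Loss for a candidate component: -(ELBO + beta * KL-from-mixture)."""
    q_new = new_comp(x)
    z = q_new.rsample()                             # reparameterized sample
    log_q = q_new.log_prob(z).sum(-1)
    elbo = (decoder.log_prob(x, z)                  # log p(x|z), assumed API
            + Normal(0.0, 1.0).log_prob(z).sum(-1)  # standard normal prior
            - log_q)
    loss = -elbo
    if components:  # diversity term: stay divergent from the current mixture
        loss = loss - beta * (log_q - mixture_log_prob(components,
                                                       log_weights, x, z))
    return loss.mean()
```

In the full recursive procedure each converged component would be frozen before the next is added, so test-time inference remains a single feed-forward pass through the whole mixture.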


Review for NeurIPS paper: Recursive Inference for Variational Autoencoders

Neural Information Processing Systems

Strengths: Soundness: The theoretical grounding and empirical evaluation are largely sound. The authors derive their technique in Section 3 and show how it naturally yields an objective for the new component that trades off the ELBO against a KL divergence from the current approximate posterior. The authors compare their approach against a comprehensive set of baselines, including a standard VAE, semi-amortized VI (which also involves additional computation), normalizing flows (which use a more expressive distribution), and a non-recursive mixture distribution (which has the same form of distribution). This comparison is performed across a range of image datasets and multiple model sizes using multiple runs. The authors also compare their approach with boosted VI, showing that the KL term, rather than entropy, is useful for mixture estimation. Significance: The proposed approach outperforms more expressive VI techniques, particularly IAF.
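Schematically, the trade-off described here can be written as the following component-selection objective (notation ours, not taken verbatim from the paper):

$$
\mathcal{J}(q_{K+1}) \;=\; \mathrm{ELBO}(q_{K+1}) \;+\; \beta\, \mathrm{KL}\!\left(q_{K+1}(z \mid x) \,\middle\|\, q_{\le K}(z \mid x)\right),
$$

where $q_{\le K}$ denotes the current (frozen) mixture posterior and $\beta \ge 0$ weights the diversity term; the new component $q_{K+1}$ is fit by maximizing $\mathcal{J}$.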


Review for NeurIPS paper: Recursive Inference for Variational Autoencoders

Neural Information Processing Systems

Improving inference in VAEs is an area of wide interest and potentially high impact. The reviewers thought the paper was mostly well written and the approach sensible. The experimental results are encouraging, with RMIM outperforming quite a few baselines, though there were some potential issues with the experimental setup, as explained below. The paper nevertheless has substantial room for improvement, and the reviewers made several good suggestions; for example, the derivation of the objective needs to be made clearer, including an explanation of how exactly it differs from the derivation in the BVI paper.



Report No. 84-06: Controlling Recursive Inference. David E. Smith, Michael R. Genesereth, Matthew L. Ginsberg. Stanford University.

AI Classics

Loosely speaking, recursive inference is when an inference procedure generates an infinite sequence of similar subgoals. In general, the control of recursive inference involves demonstrating that recursive portions of a search space will not contribute any new answers to the problem beyond a certain level. We first review a well-known syntactic method for controlling repeating inference (inference where the conjuncts processed are instances of their ancestors), provide a proof that it is correct, and discuss the conditions under which the strategy is optimal. We also derive more powerful pruning theorems for cases involving transitivity axioms and cases involving subsumed subgoals. The treatment of repeating inference is followed by consideration of the more difficult problem of recursive inference that does not repeat. Here we show how knowledge of the properties of the relations involved and knowledge about the contents of the system's database can be used to prove that portions of a search space will not contribute any new answers.
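As a toy illustration of the pruning idea (our construction, not code from the report), the sketch below runs backward chaining over a reachability relation and abandons any branch whose current subgoal repeats one of its ancestor goals. Exact repetition is the simplest special case of the "instance of an ancestor" check, which in general requires subsumption over variables; all names here are hypothetical.

```python
def reachable(x, y, edges, ancestors=()):
    """Backward chaining for reachable(x, y) with ancestor pruning.

    Rule: reachable(x, y) <- edge(x, z), reachable(z, y).
    Without the pruning check, a cyclic edge set makes this recurse
    forever; with it, a repeated subgoal is cut off, since it can
    contribute no answers beyond those of its ancestor.
    """
    goal = ('reachable', x, y)
    if goal in ancestors:                 # subgoal repeats an ancestor
        return False                      # -> prune this branch
    if x == y or (x, y) in edges:
        return True
    return any(reachable(z, y, edges, ancestors + (goal,))
               for (u, z) in edges if u == x)

# A cycle that would loop forever without the ancestor check:
edges = {('a', 'b'), ('b', 'a')}
print(reachable('a', 'c', edges))   # False, and terminates
print(reachable('a', 'b', edges))   # True
```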


A Calculus for Causal Relevance

Bonet, Blai

arXiv.org Artificial Intelligence

This paper presents a sound and complete calculus for causal relevance, based on Pearl's functional models semantics. The calculus consists of axioms and rules of inference for reasoning about causal relevance relationships. We extend the set of known axioms for causal relevance with three new axioms, and introduce two new rules of inference for reasoning about specific subclasses of models. These subclasses give a more refined characterization of causal models than the one given in Halpern's axiomatization of counterfactual reasoning. Finally, we show how the calculus for causal relevance can be used in the task of identifying causal structure from non-observational data.
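As a concrete (and much simplified) reading of the underlying functional-models semantics, not of the calculus itself: X is causally relevant to Y when some intervention do(X=x) changes Y under at least one setting of the exogenous variables. The brute-force check below over small finite domains is our own hypothetical construction; it omits conditioning sets, and all names are illustrative.

```python
from itertools import product

def evaluate(fns, order, exogenous, do):
    """Compute every endogenous variable from its structural function,
    in topological order, honoring do()-style overrides."""
    vals = dict(exogenous)
    for v in order:
        vals[v] = do[v] if v in do else fns[v](vals)
    return vals

def causally_relevant(x, y, fns, order, u_domains, x_domain):
    """True iff some intervention on x changes y under some exogenous
    setting (a simplified, unconditioned notion of relevance)."""
    for setting in product(*u_domains.values()):
        exo = dict(zip(u_domains, setting))
        outcomes = {evaluate(fns, order, exo, {x: xv})[y]
                    for xv in x_domain}
        if len(outcomes) > 1:            # intervening on x changed y
            return True
    return False

# Example: Y = X xor U with X = U; X is relevant to Y, while W is not.
fns = {'X': lambda v: v['U'],
       'W': lambda v: 0,
       'Y': lambda v: v['X'] ^ v['U']}
order = ['X', 'W', 'Y']
print(causally_relevant('X', 'Y', fns, order, {'U': [0, 1]}, [0, 1]))  # True
print(causally_relevant('W', 'Y', fns, order, {'U': [0, 1]}, [0, 1]))  # False
```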